Head Pose Classification in Crowded Scenes

Authors

  • Javier Orozco
  • Shaogang Gong
  • Tao Xiang
Abstract

Human head pose and gaze directions have traditionally been studied for expression and face recognition, and human-computer interaction [6]. Accurate estimations of either head or gaze direction can provide useful information for the inference of a person’s intent and behaviour. However, most existing techniques rely upon medium- to high-resolution images captured under well-controlled conditions from a fairly close distance [3, 5, 7, 10]. Given high-resolution images, most existing techniques deploy extensive feature extraction to capture detailed head/facial shape and texture information. However, this approach relies on accurate subtraction of the head foreground region from the background, which is not always feasible. We propose a novel technique for head pose classification in crowded public spaces under poor lighting and in low-resolution video images. Unlike previous approaches, we avoid the need for explicit segmentation of skin and hair regions from a head image and implicitly encode spatial information using a grid map for more robustness given low-resolution images. Specifically, a new head pose descriptor is formulated using similarity distance maps by indexing each pixel of a head image to the mean appearance templates of head images at different poses. These distance feature maps are then used to train a multi-class Support Vector Machine for pose classification. Our approach is evaluated against established techniques [2, 8, 9] using the i-LIDS underground scene dataset under challenging lighting and viewing conditions. As in [8], a 360° head pose in panning angle is discretized into eight pose classes in 45° increments. The results demonstrate that our model gives a significant improvement in head pose estimation accuracy, with a pose recognition rate of over 80%, against 32% for the best of the existing models.
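The pose discretization and the distance-map descriptor described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state where the bin boundaries lie or which similarity distance is used, so the sketch assumes bins centred on multiples of 45° and uses a plain per-pixel absolute intensity difference as a stand-in. All function names (`pose_class`, `mean_template`, `distance_map`, `descriptor`) are hypothetical.

```python
from typing import List, Sequence

# Grayscale head image as rows of pixel intensities (illustrative type).
Image = List[List[float]]

NUM_POSES = 8                   # eight pose classes, as in the abstract
BIN_WIDTH = 360.0 / NUM_POSES   # 45 degrees per class


def pose_class(pan_deg: float) -> int:
    """Map a pan angle in degrees to a class index 0..7.

    Assumption: bins are centred on 0, 45, 90, ... degrees, so class 0
    covers [-22.5, 22.5). The abstract does not specify the boundaries.
    """
    shifted = (pan_deg % 360.0) + BIN_WIDTH / 2.0
    return int(shifted // BIN_WIDTH) % NUM_POSES


def mean_template(images: Sequence[Image]) -> Image:
    """Per-pixel mean of equally sized training images of one pose."""
    n = len(images)
    rows, cols = len(images[0]), len(images[0][0])
    return [[sum(img[r][c] for img in images) / n for c in range(cols)]
            for r in range(rows)]


def distance_map(image: Image, template: Image) -> Image:
    """Per-pixel distance between an image and one pose template.

    Assumption: absolute intensity difference stands in for the
    similarity distance, which the abstract leaves unspecified.
    """
    return [[abs(p - t) for p, t in zip(img_row, tpl_row)]
            for img_row, tpl_row in zip(image, template)]


def descriptor(image: Image, templates: Sequence[Image]) -> List[float]:
    """Concatenate the distance maps to every pose template into one vector."""
    features: List[float] = []
    for template in templates:
        for row in distance_map(image, template):
            features.extend(row)
    return features
```

Feature vectors built this way, paired with labels from `pose_class`, would then be passed to a multi-class SVM trainer; an off-the-shelf classifier such as scikit-learn's `SVC` is one possible choice, not something the abstract names.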

Similar resources

Online multiple people tracking-by-detection in crowded scenes

Multiple people detection and tracking is a challenging task in real-world crowded scenes. In this paper, we have presented an online multiple people tracking-by-detection approach with a single camera. We have detected objects with deformable part models and a visual background extractor. In the tracking phase we have used a combination of support vector machine (SVM) person-specific classifie...

Head Detection in Densely Crowded Scenes

A Structural Filter Approach to Human Detection

Occlusions and articulated poses make human detection much more difficult than common more rigid object detection like face or car. In this paper, a Structural Filter (SF) approach to human detection is presented in order to deal with occlusions and articulated poses. A three-level hierarchical object structure consisting of words, sentences and paragraphs in analog to text grammar is proposed ...

An Adaptation Framework for Head-Pose Classification in Dynamic Multi-view Scenarios

Multi-view head-pose estimation in low-resolution, dynamic scenes is difficult due to blurred facial appearance and perspective changes as targets move around freely in the environment. Under these conditions, acquiring sufficient training examples to learn the dynamic relationship between position, face appearance and head-pose can be very expensive. Instead, a transfer learning approach is pr...

Estimating the Number of People in Crowded Scenes by MID Based Foreground Segmentation and Head-shoulder Detection

This paper proposes a novel method to address the problem of estimating the number of people in surveillance scenes with people gathering and waiting. The proposed method combines a MID (Mosaic Image Difference) based foreground segmentation algorithm and a HOG (Histograms of Oriented Gradients) based head-shoulder detection algorithm to provide an accurate estimation of people counts in the ob...

Publication date: 2009